Reduction of Bleed-through in Scanned Manuscript Documents

Authors

  • Eric Dubois
  • Anita Pathak
Abstract

Many old manuscript documents were written on both sides of the paper, and ink bleeding through from one side to the other makes the information on the page harder to read or decipher. This paper presents digital image processing techniques for reducing such bleed-through distortion. Both sides of the document are scanned, maintaining full spatial and amplitude resolution (8 bits/sample). The bleed-through is reduced by processing both sides of the document simultaneously. First the verso side is flipped from left to right, and then the recto and flipped verso images are registered. This registration is necessary because it is impossible to align the front and back perfectly when scanning the document, and the scanner may not be perfectly uniform. We use a six-parameter affine transformation to register the two sides, determining the parameters with an optimization method. Once the two sides have been registered, areas consisting primarily of bleed-through are identified and replaced by the background color or intensity. The method has been tested on a number of documents, including documents we generated under controlled conditions and some original manuscripts; it greatly improves the readability of documents with heavy bleed-through.
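The pipeline described in the abstract (flip the verso, register it with a six-parameter affine transformation, then replace bleed-through with the background) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the nearest-neighbour resampling, and the threshold rule with its `ink_threshold` and `margin` values are all assumptions made for illustration, and the affine parameters are taken as given rather than found by optimization.

```python
from typing import List, Sequence, Tuple

# Grayscale image as rows of 8-bit samples: 0 = dark ink, 255 = white paper.
Image = List[List[int]]


def flip_left_right(img: Image) -> Image:
    """Mirror the verso scan so its pixels line up with the recto."""
    return [row[::-1] for row in img]


def affine_map(params: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Six-parameter affine map: (x, y) -> (a*x + b*y + c, d*x + e*y + f)."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


def warp_affine(img: Image, params: Sequence[float], fill: int = 255) -> Image:
    """Resample img so that out[y][x] = img at affine_map(params, x, y).

    Nearest-neighbour sampling (an assumption); samples that fall outside
    the source image are set to `fill`.
    """
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = affine_map(params, x, y)
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = img[iy][ix]
    return out


def reduce_bleed_through(
    recto: Image,
    verso_registered: Image,
    background: int = 255,
    ink_threshold: int = 128,  # hypothetical value
    margin: int = 40,          # hypothetical value
) -> Image:
    """Replace likely bleed-through on the recto with the background level.

    Heuristic (an assumption, not the paper's rule): a recto pixel is treated
    as bleed-through when the registered verso has dark ink at that position
    and the recto is clearly lighter than that ink.
    """
    out = [row[:] for row in recto]
    for y, (r_row, v_row) in enumerate(zip(recto, verso_registered)):
        for x, (r, v) in enumerate(zip(r_row, v_row)):
            if v < ink_threshold and r > v + margin:
                out[y][x] = background
    return out


def clean_recto(recto: Image, verso: Image, params: Sequence[float]) -> Image:
    """Flip the verso, register it with the given affine parameters, clean the recto."""
    registered = warp_affine(flip_left_right(verso), params)
    return reduce_bleed_through(recto, registered)
```

With the identity parameters `(1, 0, 0, 0, 1, 0)` the flipped verso is used as is. In practice the six parameters would come from an optimization step that minimizes some mismatch measure between the recto and the flipped verso, which this sketch leaves out.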

Related resources

Bleed-through removal in degraded documents

This paper presents a linear-based restoration method for bleed-through degraded document images and uses a Bayesian approach for bleed-through reduction. A variation of iterated conditional modes (ICM) optimisation is used whereby samples are drawn for the clean image estimates, whilst the remaining variables are estimated via the mode of their conditional probabilities. The proposed method is...

Enhanced bleed through removal for scanned document images

Back-to-front interference, referred to as bleed through, is a common problem in documents printed on translucent pages with insufficient opacity. The present state-of-the-art algorithms address bleed through based on entropy [1-3], entropic correlation [4] and discriminator analysis [5, 10]. However, a common drawback of such algorithms is their inefficient processing of documents that are ei...

Wavelet-based separation of nonlinear show-through and bleed-through image mixtures

This work addresses the separation of the nonlinear real-life mixture of images that occurs when a page of a document is scanned or photographed and the back page shows through. This effect can be due to partial paper transparency (show-through) and/or to bleeding of the ink through the paper (bleed-through). These two causes usually lead to mixtures with different characteristics. We propose a...

Methods for Written Ancient Music Restoration

Access to collections of cultural heritage is increasingly becoming a topic of interest for institutions like libraries. With the easy access to information provided by technologies such as the Internet, new ways exist for consulting ancient documents without exposing them to more dangers of degradation. One of those types of documents is written ancient music. These documents suffer from multi...

A Ground Truth Bleed-Through Document Image Database

This paper introduces a new database of 25 recto/verso image pairs from documents suffering from bleed-through degradation, together with manually created foreground text masks. The structure and creation of the database are described, and three bleed-through restoration methods are compared in two ways: visually, and quantitatively using the ground truth masks.

Publication date: 2001